GenXHC: a probabilistic generative model for cross-hybridization compensation in high-density genome-wide microarray data

Authors

  • Jim C. Huang
  • Quaid Morris
  • Timothy R. Hughes
  • Brendan J. Frey
Abstract

MOTIVATION: Microarray designs containing millions to hundreds of millions of probes that tile entire genomes are currently being released. Within the next 2 months, our group will release a microarray data set containing over 12,000,000 microarray measurements taken from 37 mouse tissues. A problem that will become increasingly significant in the upcoming era of genome-wide exon-tiling microarray experiments is the removal of cross-hybridization noise. We present a probabilistic generative model for cross-hybridization in microarray data and a corresponding variational learning method for cross-hybridization compensation, GenXHC, that reduces cross-hybridization noise by taking into account multiple sources for each mRNA expression level measurement, as well as prior knowledge of hybridization similarities between the nucleotide sequences of microarray probes and their target cDNAs.

RESULTS: The algorithm is applied to a subset of an exon-resolution genome-wide Agilent microarray data set for chromosome 16 of Mus musculus and is found to produce statistically significant reductions in cross-hybridization noise. The denoised data is found to produce enrichment in multiple gene ontology-biological process (GO-BP) functional groups. The algorithm is found to outperform robust multi-array analysis, another method for cross-hybridization compensation.
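The idea of "multiple sources for each measurement" can be illustrated with a toy example. The sketch below is not GenXHC: it has no noise model and no variational learning, and it simply assumes that each probe signal is a weighted sum of transcript expression levels, with weights standing in for prior hybridization similarity. The weight values, the function names `mix` and `compensate`, and the step-size settings are all hypothetical choices made for illustration.

```python
def mix(weights, expression):
    """Probe signal as a weighted sum of transcript expression levels."""
    return [sum(w * z for w, z in zip(row, expression)) for row in weights]


def compensate(weights, observed, steps=5000, lr=0.05):
    """Estimate non-negative expression levels by projected gradient
    descent on the squared error between predicted and observed signals."""
    n_sources = len(weights[0])
    z = [0.0] * n_sources
    for _ in range(steps):
        predicted = mix(weights, z)
        residual = [p - x for p, x in zip(predicted, observed)]
        # Gradient of 0.5 * ||W z - x||^2 with respect to z is W^T (W z - x).
        grad = [
            sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(n_sources)
        ]
        # Gradient step, then clip at zero to keep expression non-negative.
        z = [max(0.0, zj - lr * g) for zj, g in zip(z, grad)]
    return z


# Hypothetical similarity weights: off-diagonal entries model cross-hybridization.
weights = [[1.0, 0.3],
           [0.2, 1.0]]
true_expression = [5.0, 1.0]
observed = mix(weights, true_expression)  # contaminated probe signals
estimate = compensate(weights, observed)
```

In this noise-free toy case the second probe's raw signal is roughly double its transcript's true level because of the contribution from the first transcript, and the estimate recovers the underlying levels. Real data would need a noise model and learned weights, which is what the probabilistic formulation in the abstract addresses.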

Similar resources

GenRate: A Generative Model that Reveals Novel Transcripts in Genome-Tiling Microarray Data

Genome-wide microarray designs containing millions to hundreds of millions of probes are available for a variety of mammals, including mouse and human. These genome tiling arrays can potentially lead to significant advances in science and medicine, e.g., by indicating new genes and alternative primary and secondary transcripts. While bottom-up pattern matching techniques (e.g., hierarchical clu...

GenRate: A Generative Model That Finds and Scores New Genes and Exons in Genomic Microarray Data

Recently, researchers have made some progress in using microarrays to validate predicted exons in genome sequence and find new gene structures. However, current methods rely on separately making threshold-based decisions on intensity of expression, similarity of expression profiles, and arrangements of exons in the genome. We have taken a Bayesian approach and developed GenRate, a generative mo...

Finding Novel Transcripts in High-Resolution Genome-Wide Microarray Data Using the GenRate Model

Genome-wide microarray designs containing millions to tens of millions of probes will soon become available for a variety of mammals, including mouse and human. These “tiling arrays” can potentially lead to significant advances in science and medicine, e.g., by indicating new genes and alternative primary and secondary transcripts. While bottom-up pattern matching techniques (e.g., hierarchical ...

Investigation on metabolism of cisplatin resistant ovarian cancer using a genome scale metabolic model and microarray data

Objective(s): Many cancer cells show significant resistance to drugs that kill drug sensitive cancer cells and non-tumor cells and such resistance might be a consequence of the difference in metabolism. Therefore, studying the metabolism of drug resistant cancer cells and comparison with drug sensitive and normal cell lines is the objective of this research. Material and Methods: Metabolism of c...

Broadening Gene Pool of Rice for Resistance to Biotic Stresses Through Wide Hybridization

Variability in the cultivated germplasm for economic traits such as resistance to rice tungro virus, sheath blight, yellow stem borer, drought and salt tolerance is limited. This necessitated search for the genes in secondary and tertiary gene pool of genus Oryza. Fortunately, wild species are an important reservoir of useful genes for resistance to major disease, pest and tolerance t...

Journal:
  • Bioinformatics

Volume 21 Suppl 1

Publication date 2005